ABSTRACT


An auxiliary service unit is normally idle, or in cold
standby.  If  a  demand  for  the  unit's  service  occurs,  the  unit 
must  be  available  to  satisfy  it,  or  else  "catastrophe"  occurs. 
Policies  for  periodic  inspection  and  maintenance  of  such  a  unit 
are  derived  in  this  paper  that  maximize  the  expected  time  until  a 
catastrophe  occurs.  The  policies  recognize  that  inspection, 
maintenance,  and  repair  periods  are  of  non-zero  duration,  during 
which the unit is vulnerable. They also account for the possibility
of hazardous inspection that may damage the unit, and
various  forms  of  imperfect  repair. 

Important  examples  occur  in  the  nuclear  power  industry:  a 
unit may be a pump, or emergency diesel generator, and a demand
may be caused by an initiating event such as pipe break or loss
of off-site power; "catastrophe" equates to loss-of-coolant accident
or meltdown. Other examples occur in the military, and in
emergency  services  to  hospitals. 

Key words: Reliability, availability, maintenance, time to
failure, inspection, Markov decision process, nuclear
safety, standby redundancy.


1.  INTRODUCTION 


It  is  common  practice  to  improve  the  reliability  of  a  system 
by  installing  cold  standby  units,  which  are  only  brought  into 
operation when a standard operating system fails. In particular,
diesel generators in cold standby may be used to scram a
reactor  in  case  of  a  coolant  pipe  breaking  or  some  other  failure 
in  a  nuclear  power  plant.  Other  examples  occur  in  hospital 
power  supplies  and  military  hardware.  If  such  a  standby  system 
fails  to  operate  when  it  is  required,  then  the  consequences  could 
be  catastrophic.  The  times  when  there  is  a  need  for  the  standby 
unit  are  called  initiating  events.  If  the  standby  system  is  in 
a failed state when an initiating event occurs, then a catastrophic
event is said to occur.

It  is  necessary  to  inspect  and  maintain  the  standby  system 
from time to time. If inspection reveals it to be in an unsatisfactory
state, repairs are made. The idea is that the standby
unit  can  go  down  even  when  it  is  not  operating  and  this  will 
cause  it  to  fail  to  operate  the  next  time  it  is  needed. 

The  following  policy  has  been  proposed  for  the  inspection  of 
diesel  generators  in  a  reactor.  After  a  generator  is  found  to 
be  down  on  inspection  and  is  repaired,  it  undergoes  K  inspections 
at  short  intervals  of  time.  If  it  is  found  to  be  up  at  each  of 
these  short  inspections,  then  it  is  inspected  at  long  intervals 
thereafter  until  it  is  found  to  be  down.  Whenever  a  generator 
is  found  to  be  down  and  is  repaired,  inspections  start  with  the 
K  short  inspection  intervals  again.  This  type  of  inspection 
policy reflects the idea that after the system is repaired it
should be inspected more often for a while to ensure it was repaired
correctly. In Section 2 we present a model for this
inspection  policy  and  derive  an  expression  for  the  expected  time 
to  a  catastrophic  event. 

In  Sections  3  through  5  we  will  use  various  Markov  decision 
and  renewal  theoretic  formulations  of  the  problem  to  investigate 
the  forms  of  the  optimal  inspection  policies  which  maximize  the 
expected  time  until  a  catastrophic  event  occurs.  This  will  show 
us  how  certain  assumptions  about  inspection  and  repair  of  the 
standby  system  affect  the  form  of  the  inspection  policy. 

Almost  all  the  previous  work  on  inspecting  a  single  standby 
unit  uses  a  cost  criterion.  Barlow  and  Proschan  [2]  described 
the basic average cost per unit time model with accurate instantaneous
inspection and faultless repair, while Luss and Kander
[9] allowed for non-zero inspection times. Wattanapanom and
Shaw  [20]  studied  the  problem  when  inspection  is  hazardous,  so 
that  it  is  possible  for  the  inspection  to  cause  the  unit  to  fail. 
Nakagawa  [11]  looked  at  the  probability  that  at  an  initiating 
event  the  standby  system  will  work,  while  Butler  [3]  maximized 
the expected lifetime of the standby unit, but did not allow repairs.
His model allowed the standby unit to be in more than one
'up' state, distinguishable only upon inspection. This
connects  with  the  work  on  partially  observable  Markov  decision 
processes [1,10,16], and in particular the problem of optimal
inspection and repair of a deteriorating process with imperfect
information introduced by Ross [13] and generalized by White [21],
Rosenfield [12], Luss [8], Sengupta [15], Suzuki [17], and Wong
[19]. In these papers, a system can be in more than one state,
but which one is known only imperfectly or upon inspection.

Our models of the inspection and repair of the standby system
allow for non-zero inspection-maintenance times and non-zero
repair  periods,  but  we  ignore  the  time  the  unit  is  in  use.  The 
idea  is  that  during  inspection-maintenance  and  repair  the  unit 
cannot react to an initiating event and so these are critical
times  for  the  system,  whereas  we  make  the  assumption  that  the 
time  the  standby  system  is  actually  in  use  is  so  small  it  can  be 
neglected.  We  also  allow  for  imperfect  repair  and  hazardous 
inspection,  so  that  even  if  the  unit  is  up  on  inspection,  it 
might  be  down  immediately  after.  Thus  we  explicitly  represent 
possible mistakes in inspection, and allow for incorrectly identifying
the unit as working when in fact it was down. Another
model  considered  allows  the  unit  to  be  in  one  of  two  'up'  states, 
which  are  indistinguishable  on  inspection,  but  have  different 
failure  rates.  This  is  intended  to  incorporate  the  idea  that  a 
repair  might  put  right  the  superficial  cause  of  the  unit's  failure, 
but  not  deal  with  the  underlying  problem,  which  will  recur. 

In  Section  3,  we  introduce  our  basic  discrete  time  models 
where  the  unit  can  only  be  either  'up'  or  'down'.  The  times 
between initiating events are assumed to have a geometric distribution.
We describe the case where successfully dealt with
initiating events are recorded as showing the unit was working
at that time. By modelling this as a Markov decision process
we can find the form of the optimal inspection policy to maximize
expected time to a catastrophic event. We compare this with the
case where we ignore any information from successfully dealt
with initiating events. We also look at the expected times until
a  catastrophic  event  under  different  policies,  and  optimize  the 
probability  that  the  system  will  last  at  least  a  fixed  number 
of  time  periods.  Section  4  describes  the  equivalent  continuous 
time  model  and  shows  how  the  discrete  time  results  are  replicated 
if  the  lifetime  of  the  unit  is  exponential  and  the  initiating 
events occur according to a Poisson process. We also investigate
the optimal inspection policy for general lifetime distributions.
Section 5 generalizes the discrete time model to allow
the  unit  to  be  in  two  'up'  states.  In  certain  cases  the  optimal 
inspection  policy  for  this  model  has  quite  short  inspection 
periods  immediately  after  a  repair,  which  then  lengthen  as 
further  inspections  suggest  the  system  is  in  the  "better"  up 
state.


2.  CONTINUOUS TIME MODEL WITH TWO-UP STATES AND SHORT-LONG
INSPECTION POLICY


Assume the system can be in one of two up-states $j = 1, 2$
until it fails. The two up-states are indistinguishable upon
inspection. After a repair the system goes to up-state $j$ with
probability $\pi_j$ and remains there until it fails. After a repair
the conditional distribution of the time to failure given it is
in up-state $j$ is $G_j$, independent of the past.

After a repair the system is inspected and maintained at K
short intervals of length S. If the system is found to be up
at each of the K short inspection intervals, then future inspections
occur at long intervals of length L > S. If the system is
found to be down upon inspection, it is repaired and then inspected
at K short inspection intervals again before the long
inspection intervals begin. If the system is found to be up
upon inspection, routine maintenance is performed. Given the
system is in up-state $j$, the conditional distribution of the
time to failure after an inspection is $F_j$, independent of the
past. Some reasonable and tractable examples of distributions
$F_j$ and $G_j$ are the exponential, and the exponential with a probability
atom at the origin reflecting hazardous inspection or
faulty repair.

Inspection-maintenance takes M units of time and repair
takes R units of time. Initiating events occur according to a
Poisson process with rate $\nu$. The system is unable to respond
to an initiating event during inspection-maintenance or repair.

A catastrophic event is said to occur if an initiating event
occurs when the system has failed or is being inspected,
maintained, or repaired. Let T denote the time of the first
catastrophic event. We will derive an expression for the expected
value of T.

Let $f(j,k) = E_{j,k}[T]$ denote the expected time to the first
catastrophic event given $k = 0,1,\ldots,K$ short inspection periods
have already successfully taken place and the system is in up-state
$j$. Let $f(j,\ell) = E_{j,\ell}[T]$ denote the expected time to first
catastrophic event given a successful inspection has just taken
place, the next inspection period is long, and the system is
in up-state $j$.

A probabilistic argument gives the following system of
equations, where $\bar F_j(S) = 1 - F_j(S)$ and $\bar G_j$ is defined likewise:

$$
\begin{aligned}
f(j,0) ={}& \bar G_j(S)\, e^{-\nu M}\,\{ S + M + f(j,1) \} \\
&+ \bar G_j(S) \int_0^{M} (S+u)\, \nu e^{-\nu u}\, du \\
&+ \int_0^{S} G_j(du) \int_0^{S-u+R} (u+z)\, \nu e^{-\nu z}\, dz \\
&+ \int_0^{S} G_j(du)\, e^{-\nu (S-u+R)} \Big[ S + R + \sum_{j'=1}^{2} \pi_{j'} f(j',0) \Big] ;
\end{aligned}
\tag{2.1}
$$

$$
f(j,k) = a(j,S) + p(j,S)\, f(j,k+1) + c(j,S)\, \pi f(0) ; \tag{2.5}
$$

$$
f(j,\ell) = a(j,L) + p(j,L)\, f(j,\ell) + c(j,L)\, \pi f(0) \tag{2.6}
$$

where

$$
\pi f(0) = \sum_{j=1}^{2} \pi_j f(j,0) ; \tag{2.7}
$$

$$
p_0(j,S) = \bar G_j(S)\, e^{-\nu M} ; \tag{2.8}
$$

$$
c_0(j,S) = e^{-\nu (S+R)} \int_0^{S} e^{\nu u}\, G_j(du) ; \tag{2.9}
$$

$$
a_0(j,S) = \frac{1}{\nu} \big[ 1 - p_0(j,S) - c_0(j,S) \big] + \bar G_j(S)\, S + \int_0^{S} u\, G_j(du) ; \tag{2.10}
$$

$$
p(j,t) = \bar F_j(t)\, e^{-\nu M} ; \tag{2.11}
$$

$$
c(j,t) = e^{-\nu (t+R)} \int_0^{t} e^{\nu u}\, F_j(du) ; \tag{2.12}
$$

$$
a(j,t) = \frac{1}{\nu} \big[ 1 - p(j,t) - c(j,t) \big] + \bar F_j(t)\, t + \int_0^{t} u\, F_j(du) . \tag{2.12}
$$

In the special case in which $F_j$ has an exponential distribution
with an atom at the origin,

$$
F_j(t) =
\begin{cases}
0 & \text{if } t < 0 , \\
(1-\alpha_j) + \alpha_j \big[ 1 - e^{-\delta_j t} \big] & \text{if } t \ge 0 ,
\end{cases}
$$

then

$$
p(j,t) = \alpha_j\, e^{-\delta_j t}\, e^{-\nu M} ; \tag{2.13}
$$

$$
c(j,t) = e^{-\nu (t+R)} \Big\{ (1-\alpha_j) + \alpha_j\, \frac{\delta_j}{\delta_j - \nu} \big[ 1 - e^{-(\delta_j - \nu) t} \big] \Big\} ; \tag{2.14}
$$

$$
a(j,t) = \frac{1}{\nu} \big[ 1 - p(j,t) - c(j,t) \big] + \frac{\alpha_j}{\delta_j} \big[ 1 - e^{-\delta_j t} \big] . \tag{2.15}
$$
Solving equations (2.4)-(2.6) recursively leads to the
following expression for the expected time to the first catastrophic
event given the system has just been repaired:

$$
\pi f(0) = \frac{\mathrm{NUM}}{\mathrm{DEN}} \tag{2.16}
$$

where

$$
\mathrm{NUM} = \sum_{j=1}^{2} \pi_j \big[ a_0(j,S) + p_0(j,S)\, g_N(K-1) \big] \tag{2.17}
$$

where

$$
g_N(K-1) = \frac{1 - p(j,S)^{K-1}}{1 - p(j,S)}\, a(j,S) + p(j,S)^{K-1}\, \frac{a(j,L)}{1 - p(j,L)} ; \tag{2.18}
$$

$$
\mathrm{DEN} = \sum_{j=1}^{2} \pi_j \big[ 1 - \{ c_0(j,S) + p_0(j,S)\, g_D(K-1) \} \big] \tag{2.19}
$$

$$
g_D(K-1) = \frac{1 - p(j,S)^{K-1}}{1 - p(j,S)}\, c(j,S) + p(j,S)^{K-1}\, \frac{c(j,L)}{1 - p(j,L)} . \tag{2.20}
$$
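The closed form (2.16)-(2.20) can be evaluated directly. The following is a minimal Python sketch for the exponential-with-atom case of (2.13)-(2.15), not the authors' program. The function and parameter names are hypothetical. It assumes that $G_j$ has the same form as $F_j$ with survival weight `ok_rep` in place of `ok_insp` (the form used in the EXAMPLE), that the same weights apply to both up-states, and that K is at least 1.

```python
import math


def cycle_terms(alpha, delta, t, nu, M, R):
    """Return (p, c, a) of (2.13)-(2.15) for one inspection interval of length t.

    alpha: probability the unit is actually up at the start of the interval
    delta: exponential failure rate while up
    """
    p = alpha * math.exp(-delta * t) * math.exp(-nu * M)
    if abs(delta - nu) < 1e-12:
        integral = alpha * delta * t  # limit of the bracket as delta -> nu
    else:
        integral = alpha * delta / (delta - nu) * (1.0 - math.exp(-(delta - nu) * t))
    c = math.exp(-nu * (t + R)) * ((1.0 - alpha) + integral)
    a = (1.0 - p - c) / nu + alpha / delta * (1.0 - math.exp(-delta * t))
    return p, c, a


def expected_time_after_repair(S, L, K, nu, M, R, pi, delta, ok_insp, ok_rep):
    """Evaluate pi f(0) = NUM / DEN from (2.16)-(2.20)."""
    num = 0.0
    den = 0.0
    for pi_j, d_j in zip(pi, delta):
        p0, c0, a0 = cycle_terms(ok_rep, d_j, S, nu, M, R)   # first interval after a repair (G_j)
        pS, cS, aS = cycle_terms(ok_insp, d_j, S, nu, M, R)  # later short intervals (F_j)
        pL, cL, aL = cycle_terms(ok_insp, d_j, L, nu, M, R)  # long intervals (F_j)
        geo = (1.0 - pS ** (K - 1)) / (1.0 - pS)
        g_n = geo * aS + pS ** (K - 1) * aL / (1.0 - pL)
        g_d = geo * cS + pS ** (K - 1) * cL / (1.0 - pL)
        num += pi_j * (a0 + p0 * g_n)
        den += pi_j * (1.0 - (c0 + p0 * g_d))
    return num / den
```

A grid search over integer S, L and small K with this function is one way to reproduce the kind of exploratory study described in the text. When S = L and the two weights coincide, every cycle is identical and the value reduces to a/(1 - p - c) for a single up-state, which gives a quick sanity check.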


EXAMPLE. The rate of initiating events is $\nu = 0.1$ per week.
$\pi_1 = 0.9 = 1 - \pi_2$. The length of an inspection-maintenance
period  M  is  —  weeks.  A  repair  period,  R,  is  weeks. 


$$
F_j(t) =
\begin{cases}
0 & \text{if } t < 0 , \\
(1-\mathrm{OKI}) + \mathrm{OKI} \big[ 1 - e^{-\delta_j t} \big] & \text{if } t \ge 0 ;
\end{cases}
$$

$$
G_j(t) =
\begin{cases}
0 & \text{if } t < 0 , \\
(1-\mathrm{OKR}) + \mathrm{OKR} \big[ 1 - e^{-\delta_j t} \big] & \text{if } t \ge 0 .
\end{cases}
$$

Assume $\delta_1$ = [illegible] per week, $\delta_2$ = [illegible] per week. Note that after a
repair the conditional expected time to system failure given the
system is up is $\pi_1/\delta_1 + \pi_2/\delta_2 = 26$ weeks. Thus, if after a
repair no inspections are done, then the expected time to a
catastrophic event is $(\mathrm{OKR})(26) + 1/\nu = (\mathrm{OKR})(26) + 10$ weeks.
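The no-inspection benchmark follows from two facts: the unit is down from time zero with probability 1 - OKR, and after the unit fails the wait for the next Poisson initiating event has mean $1/\nu$. A small Python sketch of that arithmetic follows; the function name and arguments are hypothetical.

```python
def expected_time_without_inspection(ok_repair, mean_life_if_up, nu):
    """Expected time to catastrophe when no inspections follow a repair."""
    # With probability ok_repair the repair worked and the unit then lasts
    # mean_life_if_up on average; otherwise it is down immediately.
    expected_failure_time = ok_repair * mean_life_if_up
    # Memorylessness: after failure, the next initiating event is 1/nu away on average.
    return expected_failure_time + 1.0 / nu
```

With a mean life of 26 weeks and $\nu = 0.1$ per week this returns 26 OKR + 10, the benchmark quoted in the text.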

An  exploratory  numerical  study  was  conducted  of  the  best 
values  of  S,  L,  and  K  for  various  values  of  OKI,  OKR.  We 
restricted  our  attention  to  the  case  in  which  inter-inspection 
periods are in integer numbers of weeks. Equations (2.16)-(2.20)
were evaluated numerically for various parameter values. Some
results are summarized in Table 1.

Table 1. (Table body not recovered; surviving column headings: "Best", "Expected Time if".)

If  the  quality  of  the  repair  is  better  than  the  quality  of 
inspection (OKR > OKI) then it appears to be better not to inspect
often initially after a repair but then to inspect more
often  as  time  goes  on.  If  OKI  >  OKR  then  it  appears  to  be 
better  to  inspect  soon  after  a  repair  and  if  the  system  is  up 
at  inspection  not  to  inspect  for  a  longer  period  of  time 
thereafter.  If  both  repair  and  inspection  are  of  poor  quality 
then  it  appears  to  be  better  not  to  do  anything.  Note  that  the 
expected  time  to  a  catastrophe  seems  to  be  more  sensitive  to 
OKI  than  to  OKR. 

In the remainder of the paper we will study optimal inspection
policies.


3.  DISCRETE TIME, ONE-UP-STATE, MARKOV DECISION PROCESS MODELS

MODEL 1

In the first model, the standby unit can either be 'up' or
'down' when it is not in operation; and, if n basic time periods,
e.g., days, have elapsed since the unit was installed, $s_n$ is
the probability that it will be 'up' at the next time period
given that it is 'up' in this (the nth) time period. Once the
unit goes 'down' it remains 'down' until either it is successfully
repaired or else a catastrophic initiating event occurs.

Each time period, the operator can inspect the unit, repair it,
or do nothing. If the inspection finds the unit is 'up', no
repairs are made, but there is a probability $(1-i)$ that the
inspection was actually hazardous or damaging, and so the unit
is 'down' immediately after inspection. An inspection which
finds the unit up takes M periods, where M need not be integer;
during this period the unit cannot respond to an initiating event.
If, on inspection, the unit is found in the down state, a repair
is attempted, which with probability r will return the unit to
the 'up' state and with probability $(1-r)$ leaves it in the down
state; this takes a total time of R periods to perform ($R \ge M$);
again the unit cannot respond to an initiating event during this
period. If the operator decides on a repair without inspection,
the unit is again out of operation for R periods and has probability
r of being in the 'up' state immediately afterwards,
irrespective of whether it was up or down before the repair.

An initiating event, i.e., one that demands the standby
unit's services, occurs at random with probability $\beta$ each period,
i.e., according to a Bernoulli trials process, so the times
between events are independent and geometric. In this model we
assume the operator is aware of those initiating events to which
the standby unit responded satisfactorily. Such a response implies the
unit was 'up' at that time, and although we neglect the time
it was in operation, we say there is a $(1-c)$ chance that its use will
have caused it to go down by the end of the period. So if it was used at
the nth period after the unit was installed, there is a probability
c that it will be 'up' at the next period. (If c = 1, use is
not hazardous.) If the standby system is down or is being
inspected or repaired when an initiating event occurs, a catastrophic
event occurs. The objective is to maximize the expected
number of periods until a catastrophic event occurs.

The situation described can be treated as an infinite-state
Markov decision process. The state space is describable as
$S = \{(p,n):\ 0 \le p \le 1,\ n = 1,2,\ldots\}$ where p is our belief that
the unit is 'up' this period, and n is the number of periods
since the standby unit was installed. There are three actions
open to us at each state: do nothing, inspect or repair. Let
V(p,n) be the maximum expected number of periods until a catastrophic
event, given that this is the nth period since installation,
and p is our belief at this time that the unit is 'up'.

Standard dynamic programming arguments [14] show that V(p,n)
satisfies the optimality equation

$$
V(p,n) = \max\{ W_1(p,n),\ W_2(p,n),\ W_3(p,n) \} \tag{3.1}
$$

where:

$$
\begin{aligned}
W_1(p,n) &= 1 + (1-\beta)\, V(s_n p, n+1) + \beta p\, V(c, n+1) \\
W_2(p,n) &= p \Big[ \frac{1-(1-\beta)^M}{\beta} + (1-\beta)^M V(i, n+M) \Big]
            + (1-p) \Big[ \frac{1-(1-\beta)^R}{\beta} + (1-\beta)^R V(r, n+R) \Big] \\
W_3(p,n) &= \frac{1-(1-\beta)^R}{\beta} + (1-\beta)^R V(r, n+R)
\end{aligned}
$$

Note that

$$
\frac{1-(1-\beta)^M}{\beta} = \beta + 2\beta(1-\beta) + 3\beta(1-\beta)^2 + \cdots + M(1-\beta)^{M-1}
$$

is the expected number of periods to pass, up to a maximum of
M, until an initiating event occurs. $W_j(p,n)$ represents the
payoff from an action; for example $W_1(p,n)$ corresponds to doing
nothing, where with probability $(1-\beta)$ no demand occurs, while
with probability $\beta p$ an initiating event is successfully dealt
with and with probability $(1-p)\beta$ a catastrophic event occurs.
(3.1) is an example of Denardo's contraction operator approach
to dynamic programming [4], and hence the optimal policy is independent
of the past history of the system and consists of
inspecting in state (p,n) if $W_2(p,n) > \max\{W_1(p,n), W_3(p,n)\}$,
repairing if $W_3(p,n) > \max\{W_1(p,n), W_2(p,n)\}$, otherwise doing
nothing.
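The identity for the expected number of periods until an initiating event, capped at M, is easy to check numerically for integer M. A minimal Python sketch with hypothetical function names:

```python
def capped_wait_closed_form(beta, M):
    """(1 - (1 - beta)^M) / beta, the closed form used in W_2 and W_3."""
    return (1.0 - (1.0 - beta) ** M) / beta


def capped_wait_series(beta, M):
    """Sum of k * P(first event at period k) for k < M, plus M * P(no event before period M)."""
    before_cap = sum(k * beta * (1.0 - beta) ** (k - 1) for k in range(1, M))
    return before_cap + M * (1.0 - beta) ** (M - 1)
```

The two functions agree for any integer M of at least 1 and any event probability strictly between 0 and 1.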

As there is a probability $\beta(1 - \max_k\{s_k\})$ of a catastrophic
event within two periods from any state and under any policy,
we have

$$
1/\beta \le V(p,n) \le \frac{2}{\beta\,(1 - \max_k s_k)} . \tag{3.2}
$$

It is easier to work with $\bar V(p,n) = V(p,n) - 1/\beta$, which is the
expected extra time until a catastrophic event because there
is a standby unit. (3.1) then becomes

$$
\bar V(p,n) = \max\{ \bar W_1(p,n),\ \bar W_2(p,n),\ \bar W_3(p,n) \} \tag{3.3}
$$

where:

$$
\begin{aligned}
\bar W_1(p,n) &= p + (1-\beta)\, \bar V(s_n p, n+1) + \beta p\, \bar V(c, n+1) \\
\bar W_2(p,n) &= p\,(1-\beta)^M\, \bar V(i, n+M) + (1-p)\,(1-\beta)^R\, \bar V(r, n+R) \\
\bar W_3(p,n) &= (1-\beta)^R\, \bar V(r, n+R) .
\end{aligned}
$$

Lemma  3.1. 

If $s_n$ are non-increasing in n then $\bar V(p,n)$ is convex and
non-decreasing in p, and non-increasing in n.

Proof. Apply value iteration to solve (3.3); the iterates
$\bar V_m(p,n)$ satisfy

$$
\bar V_{m+1}(p,n) = \max
\begin{cases}
p + (1-\beta)\, \bar V_m(s_n p, n+1) + \beta p\, \bar V_m(c, n+1) \\
p\,(1-\beta)^M\, \bar V_m(i, n+M) + (1-p)\,(1-\beta)^R\, \bar V_m(r, n+R) \\
(1-\beta)^R\, \bar V_m(r, n+R) .
\end{cases}
\tag{3.4}
$$

Let $\bar V_0(p,n) = 0$ for all p and n, which is convex and non-decreasing
in p and non-increasing in n. Since the sum of
convex functions and the maximum of convex functions are convex,
if $\bar V_m(p,n)$ is convex for all p and n so is $\bar V_{m+1}(p,n)$. Thus
by induction $\bar V_m(p,n)$ is convex in p, and since by [14]
$\bar V_m(\cdot,\cdot)$ converges to $\bar V(\cdot,\cdot)$, the solution of (3.3), this limit
function is also convex in p.

Again notice that if $\bar V_m(p,n)$ is non-decreasing in p for all
n, so is $p + (1-\beta)\bar V_m(s_n p, n+1) + \beta p \bar V_m(c, n+1)$ since $\bar V_m(\cdot,\cdot) \ge 0$,
and also $\max\{ p(1-\beta)^M \bar V_m(i,n+M) + (1-p)(1-\beta)^R \bar V_m(r,n+R),\ (1-\beta)^R \bar V_m(r,n+R) \}$
is non-decreasing in p. Hence $\bar V_{m+1}(p,n)$, the maximum of these
two non-decreasing functions, is non-decreasing and the induction
step goes through. In the limit as $m \to \infty$ this proves $\bar V(p,n)$
is non-decreasing in p.

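For the dependence of $\bar V(p,n)$ on n, we again use induction
in the iterates $\bar V_m(p,n)$: notice that (3.4) implies

$$
\bar V_{m+1}(p,n) - \bar V_{m+1}(p,n+1) \ge \min
\begin{cases}
(1-\beta)\big( \bar V_m(s_n p, n+1) - \bar V_m(s_{n+1} p, n+2) \big) + \beta p \big( \bar V_m(c,n+1) - \bar V_m(c,n+2) \big) , \\
p\,(1-\beta)^M \big( \bar V_m(i,n+M) - \bar V_m(i,n+1+M) \big) + (1-p)\,(1-\beta)^R \big( \bar V_m(r,n+R) - \bar V_m(r,n+1+R) \big) , \\
(1-\beta)^R \big( \bar V_m(r,n+R) - \bar V_m(r,n+1+R) \big) .
\end{cases}
\tag{3.5}
$$

Assume $\bar V_m(p,n) \ge \bar V_m(p,n+1)$ for all p and n; then the fact that
$\bar V_m(p,n)$ is non-decreasing in p means that, for all p,

$$
\bar V_m(s_n p, n+1) - \bar V_m(s_{n+1} p, n+2)
= \big( \bar V_m(s_n p, n+1) - \bar V_m(s_{n+1} p, n+1) \big)
+ \big( \bar V_m(s_{n+1} p, n+1) - \bar V_m(s_{n+1} p, n+2) \big) \ge 0 . \tag{3.6}
$$

Hence (3.5) gives $\bar V_{m+1}(p,n) - \bar V_{m+1}(p,n+1) \ge 0$ for all p and n,
and the induction hypothesis holds. Thus, the limit function
$\bar V(p,n)$ is also non-increasing in n.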

These  results  help  to  describe  the  optimal  policy. 

Theorem 3.1

The optimal policy is given by a set of numbers $p_n^*$,
n = 1,2,... where, n periods after installing the standby
system, one does nothing in state (p,n) if $p > p_n^*$;
inspects if $p \le p_n^*$ and $(1-\beta)^M \bar V(i,n+M) \ge (1-\beta)^R \bar V(r,n+R)$; and
repairs if $p \le p_n^*$ and $(1-\beta)^M \bar V(i,n+M) < (1-\beta)^R \bar V(r,n+R)$. Notice if
$i \ge r$, then one never repairs as $(1-\beta)^M \bar V(i,n+M) \ge (1-\beta)^R \bar V(r,n+R)$
for all n.

Proof. Notice that if $(1-\beta)^M \bar V(i,n+M) \ge (1-\beta)^R \bar V(r,n+R)$, then
$\bar W_2(p,n) \ge \bar W_3(p,n)$ for all p; otherwise $\bar W_3(p,n) \ge \bar W_2(p,n)$. Now look at
$\{ p \mid \bar W_1(p,n) \le \max_{i=2,3} \bar W_i(p,n) \}$, which is the set of states (p,n)
where it is not best to do nothing. Since both $\bar W_2(p,n)$ and
$\bar W_3(p,n)$ are linear in p and $\bar V(p,n)$ is convex, we get for any
$p_1$ and $p_2$ in the above region and any $\lambda$, $0 \le \lambda \le 1$,

$$
\begin{aligned}
\bar W_i(\lambda p_1 + (1-\lambda) p_2, n) &= \lambda \bar W_i(p_1,n) + (1-\lambda) \bar W_i(p_2,n) \\
&= \lambda \bar V(p_1,n) + (1-\lambda) \bar V(p_2,n) \ge \bar V(\lambda p_1 + (1-\lambda) p_2, n)
\end{aligned}
\tag{3.7}
$$

where i = 2 or 3 depending on which is the maximum. Hence (3.7)
implies $\max_{i=2,3} \bar W_i(\lambda p_1 + (1-\lambda) p_2, n) \ge \bar W_1(\lambda p_1 + (1-\lambda) p_2, n)$, and so the region
where it is not best to do nothing is convex.

From (3.3) we have

$$
\bar V(0,n) = \max\{ (1-\beta)\, \bar V(0,n+1),\ (1-\beta)^R\, \bar V(r,n+R) \} . \tag{3.8}
$$

If it were best to do nothing at p = 0, this would imply
$\bar V(0,n) = (1-\beta) \bar V(0,n+1)$, which contradicts the fact that $\bar V(p,n)$ is
non-increasing in n. Hence (0,n) is in the convex region where
it is not best to do nothing. Let $p_n^*$ be the maximum value of p
in this region and the result holds.

In fact the model can be rewritten so that the state space
is countable, since not all values of p can occur.
Let $S = \{(m,x,n):\ m = 0,1,2,\ldots,\ x = i, r \text{ or } c,\ n = 1,2,3,\ldots\}$
where (m,x,n) is the state when the unit is n periods since
installation and m periods since the end of the last inspection,
repair or successful response to an initiating event; x = i if
this last occurrence was an inspection that found it up; x = r
if it was a repair and x = c if it was a successfully dealt
with initiating event. The probability that the unit is up
in this state is $p(m,x,n) = x \prod_{k=1}^{m} s_{n-k}$, and so the optimality
equation (3.3) becomes

$$
\bar V(m,x,n) = \max
\begin{cases}
p(m,x,n) + (1-\beta)\, \bar V(m+1,x,n+1) + \beta\, p(m,x,n)\, \bar V(0,c,n+1) ; \\
p(m,x,n)\,(1-\beta)^M\, \bar V(0,i,n+M) + (1 - p(m,x,n))\,(1-\beta)^R\, \bar V(0,r,n+R) ; \\
(1-\beta)^R\, \bar V(0,r,n+R)
\end{cases}
\tag{3.9}
$$

and the optimal policy of Theorem 3.1 can be reinterpreted.
Corollary 3.1:

If, at n periods after installation, an initiating event
is successfully dealt with, inspect or repair next in $T_c(n)$
periods unless there is another initiating event before then;
if at n periods after installation, the unit has just been found
to be 'up' on inspection, inspect or repair next in $T_i(n)$
periods unless an initiating event occurs; if at n periods after
installation the unit has just finished a repair, then inspect
or repair in $T_r(n)$ periods unless a prior initiating event
occurs. If $i \ge r$ one always inspects, otherwise the repair
or inspect decision depends on the number of periods since
installation.


Proof. This is just a matter of pointing out that

$$
\begin{aligned}
T_c(n) &= \min\{ k \mid c\, s_n s_{n+1} \cdots s_{n+k} \le p_{n+k}^* \} , \\
T_i(n) &= \min\{ k \mid i\, s_n s_{n+1} \cdots s_{n+k} \le p_{n+k}^* \} , \\
T_r(n) &= \min\{ k \mid r\, s_n s_{n+1} \cdots s_{n+k} \le p_{n+k}^* \} .
\end{aligned}
$$

Notice that $T_c(n)$, $T_i(n)$, $T_r(n)$ reflect the ordering of
c, i and r, so if c > i > r then $T_c(n) \ge T_i(n) \ge T_r(n)$, etc.
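Given the survival probabilities and the control limits, each of these waiting times is a first-passage computation on the belief. A minimal Python sketch follows. The function name, the callables `s` and `p_star`, and the search cap `max_k` are hypothetical, and the thresholds are taken as given rather than computed from (3.3).

```python
def time_to_next_action(x, n, s, p_star, max_k=10_000):
    """Smallest k with x * s(n) * ... * s(n+k) <= p_star(n+k), or None if not found.

    x:      belief right after the last event (c, i or r)
    s:      callable, s(n) = probability of staying up from period n to n+1
    p_star: callable, p_star(n) = control limit at period n (assumed known)
    """
    belief = x
    for k in range(max_k + 1):
        belief *= s(n + k)
        if belief <= p_star(n + k):
            return k
    return None
```

With a constant survival probability of 0.9 and a constant threshold of 0.5, the beliefs c = 1.0, i = 0.8 and r = 0.6 give waiting times of 6, 4 and 1 periods, in line with the ordering noted in the text.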

The dependence of this policy on n follows because the
failure rate $(1-s_n)$ is age-dependent. We would expect that if
$s_n$ decreases with n, and consequently the failure rate is increasing,
then $T_c(n)$, $T_i(n)$ and $T_r(n)$ will also be non-increasing
in  n.  This  reflects  the  fact  that  in  the  long  run,  the  aging 
of  the  unit  will  lead  to  more  frequent  inspections.  At  the 
moment  we  are  more  interested  in  the  effect  of  inspections  and 
repair  before  aging  starts  to  play  a  part.  The  interesting 
decision  to  replace  an  aging  unit  will  not  be  analyzed  at  this 
time.  From  now  on,  assume  that  the  failure  rate  is  constant, 
which  leads  to  the  following  simplification  of  Model  1. 

Model  2 

Assume $s_n = s$ for all n in Model 1, and c = i. This corresponds
to thinking of an initiating event successfully dealt
with as an inspection which takes zero time. The state space
becomes $S = \{(m,x):\ m = 0,1,2,\ldots,\ x = i \text{ or } r\}$, the optimality
equation (3.9) becomes

$$
\bar V(m,x) = \max
\begin{cases}
x s^m + (1-\beta)\, \bar V(m+1,x) + \beta\, x s^m\, \bar V(0,i) ; \\
x s^m (1-\beta)^M\, \bar V(0,i) + (1 - x s^m)(1-\beta)^R\, \bar V(0,r) ; \\
(1-\beta)^R\, \bar V(0,r) ,
\end{cases}
\tag{3.10}
$$

and the optimal policy is either of the form $\pi_i(T_i,T_r)$ or
$\pi_r(T_i,T_r)$. $\pi_i(T_i,T_r)$ means inspect $T_i$ periods after a successful
response to an initiating event, $T_i$ periods after the
end of an inspection, or $T_r$ periods after the end of a repair,
unless another initiating event occurs, whereupon inspect if $T_i$
more periods elapse without another initiating event.
$\pi_r(T_i,T_r)$ means repair $T_i$ periods after a successfully-dealt-with
initiating event, or $T_r$ periods after the last repair, unless
another initiating event occurs. Notice that one either
always inspects or always repairs depending on the values of
$(1-\beta)^M \bar V(0,i)$ and $(1-\beta)^R \bar V(0,r)$.

Although the state space is infinite we can apply variants
of policy iteration and value iteration which solve the Markov
decision process to find the optimal policy and optimal expected
time to a catastrophic event. For any policy $\pi(T_i,T_r)$ there
are only $T_i + T_r + 2$ states the unit can be in. So for any
policy we can calculate the corresponding expected
time. Since the problem is equivalent to one with discount
factor $(1-\beta(1-s))$, we can apply the bounds in White [22] to
find a finite state approximation, whose value is within any
prescribed amount of the optimal value. These bounds tell us
how many states (m,x) we need to consider. The results given
in Table 2 are the optimal policy and optimal expected time for
different values of $\beta$, i, r, s, M and R, together with the
expected times under other policies. The numbers we have chosen
reflect an underlying model, in which inspections can be scheduled
at discrete times, say at multiples of a week. However, a
repair or inspection takes only a fraction of this time. Although
our theory was worked out for integer inspection and
repair times, we take the same formula to approximate non-integer
times. The inspection policy $\pi_i(1,0)$ means inspect
one period after last inspection or last initiating event and
immediately after a repair, while $\pi_r(0,100+)$ means repair
immediately after any initiating event or at least 100 periods
(100+) after a repair.
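As an illustration of the value-iteration route, here is a minimal Python sketch for the Model 2 optimality equation (3.10) on a truncated state space. It is not the authors' procedure. The names are hypothetical, and the truncation is a crude assumption: at `m = m_max` the "do nothing" action stays at `m_max`, where the belief is already close to zero. It does not use the bounds of White [22].

```python
def solve_model2(beta, s, i, r, M, R, m_max=200, tol=1e-10, max_iter=100_000):
    """Value iteration for (3.10); returns shifted values and the greedy policy."""
    up = {"i": i, "r": r}            # belief right after an inspection / a repair
    disc_M = (1.0 - beta) ** M       # chance of no initiating event during inspection
    disc_R = (1.0 - beta) ** R       # chance of no initiating event during repair

    def action_values(V, x, m):
        p = up[x] * s ** m                       # belief m periods after the last event
        nxt = V[x][min(m + 1, m_max)]            # truncation: clamp at m_max
        return {
            "nothing": p + (1.0 - beta) * nxt + beta * p * V["i"][0],
            "inspect": p * disc_M * V["i"][0] + (1.0 - p) * disc_R * V["r"][0],
            "repair": disc_R * V["r"][0],
        }

    V = {x: [0.0] * (m_max + 1) for x in up}
    for _ in range(max_iter):
        new_V = {x: [max(action_values(V, x, m).values()) for m in range(m_max + 1)]
                 for x in up}
        change = max(abs(new_V[x][m] - V[x][m]) for x in up for m in range(m_max + 1))
        V = new_V
        if change < tol:
            break

    policy = {}
    for x in up:
        policy[x] = []
        for m in range(m_max + 1):
            vals = action_values(V, x, m)
            policy[x].append(max(vals, key=vals.get))
    return V, policy
```

The returned values are the shifted quantities of (3.3); adding $1/\beta$ gives the expected number of periods to a catastrophic event. The first index m at which `policy["i"][m]` is not `"nothing"` plays the role of $T_i$, and likewise for `policy["r"]` and $T_r$.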

Notice the optimal policy is almost insensitive to whether
$\beta = 0.05$ or 0.01 and the expected time to a catastrophic event
is affected more by increases in i than r or even s. The
policy $\pi_i(n,0)$ to inspect immediately after a repair is optimal
if the probability of a repair not being effective is quite
high, say 0.4. Similarly, the model suggests one should not
inspect, i.e., use $\pi_r(\cdot,\cdot)$, if inspection is more hazardous than
repair, i < r.

MODEL  3. 

We might want to change our criterion from maximizing expected
time until a catastrophic event to maximizing the probability
that the system lasts at least n periods until a
catastrophic event. This might be the case if the unit is to
be completely replaced after n periods. If we apply this
criterion to Model 2, $P_n(p)$, the probability that the system lasts
at least n periods before a catastrophic event, given we believe
it is 'up' at present with probability p, satisfies the
optimality equation

$$P_0(p) = 1 \quad \text{for all } p,$$

$$P_{n+1}(p) = \max \begin{cases}
(1-\beta)\,P_n(sp) + \beta p\,P_n(i) \\
p\,(1-\beta)^{\bar M} P_{n+1-\bar M}(i) + (1-p)(1-\beta)^{N} P_{n+1-N}(r) \\
(1-\beta)^{N} P_{n+1-N}(r)
\end{cases} \tag{3.11}$$

where $\bar M = \min(M,n+1)$, $N = \min(R,n+1)$. The optimal policy is
again of a control-limit type.
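
The recursion (3.11) lends itself to backward induction over the states (m,x), where the belief is p = x s^m. The Python sketch below is illustrative only: the function name, the restriction to integer M, R >= 1, the parameter values, and the way the inspect branch is written (by analogy with the inspection term of the expected-time models) are assumptions made here, not part of the paper.

```python
from functools import lru_cache


def survival_probability(n, beta, i, r, s, M, R):
    """Chance of lasting n periods for a unit that has just been repaired.

    Assumes integer M, R >= 1. A state is (k, m, after_repair): k periods
    to go, m periods since the last inspection (x = i) or repair (x = r),
    so the belief that the unit is up is p = x * s**m.
    """

    @lru_cache(maxsize=None)
    def P(k, m, after_repair):
        if k == 0:
            return 1.0  # P_0(p) = 1 for all p
        x = r if after_repair else i
        p = x * s ** m
        m_bar, n_bar = min(M, k), min(R, k)  # durations truncated at the horizon
        # Do nothing: no event, or an event that finds the unit up.
        wait = (1 - beta) * P(k - 1, m + 1, after_repair) + beta * p * P(k - 1, 0, False)
        # Repair without inspecting.
        repair = (1 - beta) ** n_bar * P(k - n_bar, 0, True)
        # Inspect, and repair only if the unit is found down (assumed form).
        inspect = p * (1 - beta) ** m_bar * P(k - m_bar, 0, False) + (1 - p) * repair
        return max(wait, inspect, repair)

    return P(n, 0, True)
```

Calling `survival_probability(50, 0.05, 0.9, 0.8, 0.95, 1, 2)` evaluates the recursion for one hypothetical parameter set; the memoised inner function keeps the work proportional to the number of distinct states.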

Theorem 3.2.

The optimal policy to maximize the probability of lasting
n periods is given by the sequence $p_1^*, p_2^*, \ldots, p_n^*$, where with k
periods to go, do nothing if $p > p_k^*$, inspect or repair if
$p \le p_k^*$; repair if $(1-\beta)^{N} P_{n+1-N}(r) \ge (1-\beta)^{\bar M} P_{n+1-\bar M}(i)$, and inspect
otherwise.

Proof. As in Theorem 3.1, prove by induction that $P_n(p)$ is
convex and non-decreasing in p and non-increasing in n. The
convexity of $P_n(p)$ and the linearity of the second two terms
in the maximization in (3.11) then give the result.

If the state space is changed to S = {(m,x) : m = 0,1,2,...; x = i
or r}, by noting $p = xs^m$ at (m,x), the obvious change occurs in
the optimal policy. In Table 3 we compare the maximum
chance of lasting n periods before a catastrophic event for
n = 10, 50 and 200 with the same chance under the policy $\pi^*$
that maximizes the expected time to a catastrophic failure.

These figures are similar to those given for Model 2 except
that the length of period is 1/10 of that there. So we can
think of the probabilities as those of lasting 10, 50 or 100
weeks without a catastrophic failure. The optimal policy for
maximizing expected time until failure does very well in almost
all cases.


Model 4.

Suppose any information derived from having successfully
dealt with initiating events, as in Model 2, were ignored;
what changes would occur? We can no longer model this as a
Markov decision process period by period, since in such a process
we cannot ignore information we know. However, we can construct a
renewal theory model, for each end of inspection or end of repair is
a type of renewal point. Thus we can define $V_r$, $V_i$ as the maximum
expected time to a catastrophic event starting immediately
after a repair ($V_r$) or an inspection ($V_i$). The rest of the model
is the same as Model 2, with i, r, s, M, R having the same
meaning as there. The optimality equation is then


$$V_i = \max_{T_i}\Big\{ L_i(T_i) + i s^{T_i}\Big(\frac{1-(1-\beta)^M}{\beta} + (1-\beta)^M V_i\Big)
 + p_i(T_i)\Big(\frac{1-(1-\beta)^R}{\beta} + (1-\beta)^R V_r\Big)\Big\}$$

$$V_r = \max\Big\{ \max_{T_r}\Big[ L_r(T_r) + r s^{T_r}\Big(\frac{1-(1-\beta)^M}{\beta} + (1-\beta)^M V_i\Big)
 + p_r(T_r)\Big(\frac{1-(1-\beta)^R}{\beta} + (1-\beta)^R V_r\Big)\Big];$$
$$\qquad\quad \max_{W_r}\Big[ L_r(W_r) + \big(r s^{W_r} + p_r(W_r)\big)\Big(\frac{1-(1-\beta)^R}{\beta} + (1-\beta)^R V_r\Big)\Big]\Big\} \tag{3.12}$$

where $L_p(T) = T - \sum_{j=0}^{T-2} [1-ps^j]\,[1-(1-\beta)^{T-1-j}]$ is the expected number
of periods, up to a maximum of T, until a catastrophic event
occurs, if p is the probability the unit is up at the start
of the first period; and
$p_x(T) = (1-xs^T) - \sum_{j=0}^{T-1} [\beta(1-\beta)^j]\,[1-xs^{T-1-j}]$
is the probability that after T periods the unit is down but
no catastrophic event has occurred, given that initially it was
up with probability x and down with probability 1-x. Again it
is easier to work with $\tilde V_x = V_x - 1/\beta$, and the arguments of Markov
renewal programming [7] show that the optimal policy is either
$\pi_i(T_i,T_r)$, i.e., inspect $T_i$ after last inspection and $T_r$ after
last repair, or $\pi_r(W_r)$, i.e., repair $W_r$ after last repair.
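
As a numerical companion to the two definitions above, the following Python sketch evaluates $L_x(T)$ and $p_x(T)$ by direct summation. The function names and test values are hypothetical. The check at the end relies on an identity that follows from the definitions as read here (it is not stated in the paper): the expected periods gained over having no standby unit equal the expected up-time x(1-s^T)/(1-s).

```python
def expected_periods(x, T, beta, s):
    """L_x(T): expected number of periods, capped at T, before a catastrophe."""
    lost = sum((1 - x * s ** j) * (1 - (1 - beta) ** (T - 1 - j)) for j in range(T - 1))
    return T - lost


def down_no_catastrophe(x, T, beta, s):
    """p_x(T): probability the unit is down after T periods with no catastrophe yet."""
    # Probability of a catastrophe within T periods: the unit is already
    # down at the time of the last initiating event in the window.
    hit = sum(beta * (1 - beta) ** j * (1 - x * s ** (T - 1 - j)) for j in range(T))
    return (1 - x * s ** T) - hit


def expected_up_time(x, T, s):
    """Expected number of the first T periods in which the unit is up."""
    return x * (1 - s ** T) / (1 - s)
```

For example, `expected_periods(0.9, 5, 0.05, 0.95)` gives the capped expected time for a unit that starts up with probability 0.9.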

Using (3.12) we can calculate $\tilde V_r$ under these policies. For
$\pi_i(T_i,T_r)$

$$\tilde V_r = \frac{ r(1-s^{T_r})\big(1-(1-\beta)^M i s^{T_i}\big) + i(1-s^{T_i})(1-\beta)^M r s^{T_r} }
{ (1-s)\big[\big(1-(1-\beta)^M i s^{T_i}\big)\big(1-(1-\beta)^R p_r(T_r)\big) - (1-\beta)^{M+R} p_i(T_i)\, r s^{T_r}\big] } \tag{3.13}$$


while under $\pi_r(W_r)$

$$\tilde V_r = \frac{ r(1-s^{W_r}) }{ (1-s)\big(1-(1-\beta)^R\,(r s^{W_r} + p_r(W_r))\big) } \tag{3.14}$$
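
The two closed forms can be compared numerically over a grid of intervals. The Python sketch below is an illustration under assumptions: the function names, the grid bound `t_max`, and the parameter values are hypothetical, and the repair-only formula uses the factor r s^W + p_r(W) (the probability of reaching the repair without a catastrophe) so that it agrees with the repair branch of (3.12).

```python
def p_down(x, T, beta, s):
    """p_x(T): unit down after T periods, no catastrophe so far."""
    hit = sum(beta * (1 - beta) ** j * (1 - x * s ** (T - 1 - j)) for j in range(T))
    return (1 - x * s ** T) - hit


def v_inspect(Ti, Tr, beta, i, r, s, M, R):
    """Improvement after a repair under the inspection policy, as in (3.13)."""
    bM, bR = (1 - beta) ** M, (1 - beta) ** R
    num = r * (1 - s ** Tr) * (1 - bM * i * s ** Ti) + i * (1 - s ** Ti) * bM * r * s ** Tr
    den = (1 - s) * ((1 - bM * i * s ** Ti) * (1 - bR * p_down(r, Tr, beta, s))
                     - bM * bR * p_down(i, Ti, beta, s) * r * s ** Tr)
    return num / den


def v_repair(W, beta, r, s, R):
    """Improvement after a repair under the repair-only policy, as in (3.14)."""
    bR = (1 - beta) ** R
    return r * (1 - s ** W) / ((1 - s) * (1 - bR * (r * s ** W + p_down(r, W, beta, s))))


def best_policy(beta, i, r, s, M, R, t_max=40):
    """Brute-force search over integer intervals up to t_max (a hypothetical bound)."""
    candidates = [(v_inspect(Ti, Tr, beta, i, r, s, M, R), ("inspect", Ti, Tr))
                  for Ti in range(1, t_max + 1) for Tr in range(0, t_max + 1)]
    candidates += [(v_repair(W, beta, r, s, R), ("repair", W)) for W in range(1, t_max + 1)]
    return max(candidates, key=lambda c: c[0])
```

`best_policy(0.05, 0.9, 0.8, 0.95, 0.2, 0.5)` returns the largest improvement found on the grid together with a label for the policy that attains it.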


We calculate the optimal policy for the examples we did in
Model 2, and so it is useful to compare the results with those
given there. The results can be found in Table 4.

There are no great changes in the maximum time until a
catastrophic event. Notice that there are examples where Model
4 has a longer expected time. This may seem strange at first,
since in Model 4 we are ignoring information (the occurrence
of a successfully-dealt-with initiating event) which we use in
Model 2. However, to counterbalance this, in Model 4 it is
implicit that after a successfully-dealt-with initiating event
the standby system is bound to be up, while in Model 2 it is
only up with probability i. This also explains the difference
in policy for the fourth example. Since repair and inspection
are so bad, we do nothing to interfere with the unit under Model 4,
but in Model 2, because after each successfully-dealt-with
initiating event there is only a .5 chance it is up, we must
keep inspecting it to see if this has occurred. Otherwise the
only difference in policies is that the inspection intervals are
slightly longer in Model 4 than in Model 2.

4. CONTINUOUS TIME MODEL WITH ONE UP STATE

In this section we look at the continuous time analogue of
the standby unit model described in Section 3. Again, the
standby unit can be either 'up' or 'down', and remains down
either until it is inspected and repaired, or until a catastrophic
initiating event occurs. An inspection takes a time of M, and
if the unit works on inspection, nothing is done, and the lifetime
of the unit thereafter is given by the distribution function
$F_i(\cdot)$. The repair of a unit found to be 'down' on inspection
takes, together with the inspection, a time of R, and the
lifetime distribution function thereafter is $F_r(\cdot)$. (The discrete
time models have distribution functions corresponding to a point
mass at zero together with a geometric distribution.) The times
of the initiating events are given by a Poisson process with
parameter $\nu$ (so the average inter-initiating-event time is $\nu^{-1}$).

Again, we think of an initiating event that finds the unit up
as the equivalent of an inspection. The problem is to find the
times between inspections and between a repair and the next
inspection which maximize the expected time until a catastrophic
event.

From the work of Doshi [5] on continuous time Markov decision
processes, it follows that the optimal policy has a deterministic
time $T_i$ between inspections and a deterministic time $T_r$ between
a repair and the next inspection. Moreover, if $V_i$ ($V_r$) are the
maximum expected times to a catastrophic event starting after
an inspection (repair), [5] implies $V_i$ and $V_r$ satisfy the
optimality equation:

$$V_x = \sup_{T_x \ge 0}\Big\{ \int_0^{T_x} \nu e^{-\nu t}\big(t + \bar F_x(t) V_i\big)\,dt + T_x e^{-\nu T_x}
 + e^{-\nu T_x}\bar F_x(T_x)\Big(\int_0^M t\nu e^{-\nu t}dt + Me^{-\nu M} + e^{-\nu M}V_i\Big)$$
$$\qquad\qquad + e^{-\nu T_x} F_x(T_x)\Big(\int_0^R t\nu e^{-\nu t}dt + Re^{-\nu R} + e^{-\nu R}V_r\Big)\Big\} \tag{4.1}$$


where $\bar F(t) = 1 - F(t)$ and x = i or r. The $T_i$ and $T_r$ that
actually maximize the R.H.S. of (4.1) are the optimal inspection
times. Again, it is simpler to work with $\tilde V_x = V_x - 1/\nu$,
which is the improvement in expected time until a catastrophic
event when there is a standby system, over when there is no
standby system. If $\tilde V_i(T_i,T_r)$, $\tilde V_r(T_i,T_r)$ are these improvements
starting from an inspection and from a repair, when the
inter-inspection time is $T_i$ and $T_r$ is the time from repair to an
inspection, we get by rearranging (4.1) that

$$\tilde V_x(T_i,T_r) = \int_0^{T_x} e^{-\nu t}\bar F_x(t)\,dt
 + \tilde V_i(T_i,T_r)\Big(e^{-\nu T_x}e^{-\nu M}\bar F_x(T_x) + \nu\int_0^{T_x} e^{-\nu t}\bar F_x(t)\,dt\Big)
 + \tilde V_r(T_i,T_r)\,e^{-\nu R}e^{-\nu T_x}F_x(T_x) \tag{4.2}$$


Solving the system of equations (4.2) we get

$$\tilde V_i(T_i,T_r) = A(T_i,T_r)/C(T_i,T_r); \qquad \tilde V_r(T_i,T_r) = B(T_i,T_r)/C(T_i,T_r) \tag{4.3}$$

where

$$A(T_i,T_r) = \big(1-e^{-\nu(R+T_r)}F_r(T_r)\big)\int_0^{T_i} e^{-\nu t}\bar F_i(t)\,dt
 + e^{-\nu(R+T_i)}F_i(T_i)\int_0^{T_r} e^{-\nu t}\bar F_r(t)\,dt, \tag{4.4}$$

$$B(T_i,T_r) = \big(1-e^{-\nu(M+T_i)}\bar F_i(T_i)\big)\int_0^{T_r} e^{-\nu t}\bar F_r(t)\,dt
 + e^{-\nu(M+T_r)}\bar F_r(T_r)\int_0^{T_i} e^{-\nu t}\bar F_i(t)\,dt, \tag{4.5}$$

$$C(T_i,T_r) = 1 - e^{-\nu(M+T_i)}\bar F_i(T_i) - e^{-\nu(R+T_r)}F_r(T_r)
 + e^{-\nu(M+R+T_i+T_r)}\big[F_r(T_r) - F_i(T_i)\big]$$
$$\qquad - \big(1-e^{-\nu(R+T_r)}F_r(T_r)\big)\int_0^{T_i} \nu e^{-\nu t}\bar F_i(t)\,dt
 - e^{-\nu(T_i+R)}F_i(T_i)\int_0^{T_r} \nu e^{-\nu t}\bar F_r(t)\,dt. \tag{4.6}$$

If there are optimal finite inspection intervals $T_i$, $T_r$, they
must satisfy, for x = i and r,

$$A'_x(T_i,T_r)/A(T_i,T_r) = B'_x(T_i,T_r)/B(T_i,T_r) = C'_x(T_i,T_r)/C(T_i,T_r) \tag{4.7}$$

where $A'_i = \partial A/\partial T_i$ and $A'_r = \partial A/\partial T_r$, etc.
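
To make the closed forms concrete, the Python sketch below evaluates A/C and B/C for exponential lifetimes, an assumption chosen here only because the integrals are then available in closed form. The function name, the rates `lam_i` and `lam_r`, and the numerical values are hypothetical. The result can be checked by substituting it back into the pair of linear equations (4.2).

```python
import math


def improvements(Ti, Tr, nu, lam_i, lam_r, M, R):
    """Return (V_i, V_r) improvements from the closed forms A/C and B/C.

    Assumes exponential survival functions exp(-lam_i * t) after an
    inspection and exp(-lam_r * t) after a repair.
    """
    def survival(lam, T):
        return math.exp(-lam * T)

    def discounted_up_time(lam, T):
        # Integral over [0, T] of exp(-nu t) * exp(-lam t) dt.
        return (1 - math.exp(-(nu + lam) * T)) / (nu + lam)

    Fi_bar, Fr_bar = survival(lam_i, Ti), survival(lam_r, Tr)
    Fi, Fr = 1 - Fi_bar, 1 - Fr_bar
    Ii, Ir = discounted_up_time(lam_i, Ti), discounted_up_time(lam_r, Tr)

    A = (1 - math.exp(-nu * (R + Tr)) * Fr) * Ii + math.exp(-nu * (R + Ti)) * Fi * Ir
    B = (1 - math.exp(-nu * (M + Ti)) * Fi_bar) * Ir + math.exp(-nu * (M + Tr)) * Fr_bar * Ii
    C = (1 - math.exp(-nu * (M + Ti)) * Fi_bar - math.exp(-nu * (R + Tr)) * Fr
         + math.exp(-nu * (M + R + Ti + Tr)) * (Fr - Fi)
         - (1 - math.exp(-nu * (R + Tr)) * Fr) * nu * Ii
         - math.exp(-nu * (Ti + R)) * Fi * nu * Ir)
    return A / C, B / C
```

A grid or gradient search over `Ti` and `Tr` on top of this function is one way to locate intervals satisfying the stationarity condition numerically.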


In the special case where the extra time for a repair is
zero and the lifetime of the unit is the same whether an inspection
or a repair has just taken place, we can show that
the optimal inspection times are finite. In this case
$\tilde V_i = \tilde V_r = \tilde V$, M = R, $F_i(\cdot) = F_r(\cdot) = F(\cdot)$ and $T_i = T_r = T$,
so (4.3) becomes

$$\tilde V(T) = A(T)/C(T) \tag{4.8}$$

where

$$A(T) = \int_0^T e^{-\nu t}\bar F(t)\,dt \tag{4.9}$$

$$C(T) = 1 - e^{-\nu(M+T)} - \int_0^T \nu e^{-\nu t}\bar F(t)\,dt \tag{4.10}$$

Lemma 4.1.

The optimal inspection time T* is finite and
$\tilde V(T^*) = \bar F(T^*)/[\nu(e^{-\nu M} - \bar F(T^*))]$.

Proof.

At a local maximum or minimum $\tilde V'(T) = 0$, which implies
$h(T) = A'(T)C(T) - C'(T)A(T) = 0$ since $C(T)^2 > 0$, where

$$h(T) = e^{-\nu T}\Big[\bar F(T)\big(1-e^{-\nu(M+T)}\big) - \nu e^{-\nu M}\int_0^T e^{-\nu t}\bar F(t)\,dt\Big]; \tag{4.11}$$

h(0) is positive and, though $h(\infty) = 0$, notice that $h(T) = e^{-\nu T}g(T)$
and as $T \to \infty$, g(T) < 0. This shows that $T = \infty$ is a minimum
turning point and that there is a finite turning point which
is a maximum. At that point $\tilde V(T^*) = A'(T^*)/C'(T^*)$, which gives
the stated value.
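
The lemma can be illustrated numerically. The Python sketch below assumes an exponential lifetime with survival function exp(-lam t); the function names and parameter values are hypothetical. It locates the root of the bracket in (4.11) by bisection and compares the improvement at that point with A'(T*)/C'(T*), which from (4.9) and (4.10) equals $\bar F(T^*)/[\nu(e^{-\nu M} - \bar F(T^*))]$.

```python
import math


def discounted_up_time(T, nu, lam):
    """A(T) of (4.9) for the survival function exp(-lam * t)."""
    return (1 - math.exp(-(nu + lam) * T)) / (nu + lam)


def improvement(T, nu, lam, M):
    """V(T) = A(T) / C(T), from (4.8) to (4.10)."""
    A = discounted_up_time(T, nu, lam)
    C = 1 - math.exp(-nu * (M + T)) - nu * A
    return A / C


def bracket(T, nu, lam, M):
    """g(T): the bracket of (4.11), so that h(T) = exp(-nu T) * g(T)."""
    return (math.exp(-lam * T) * (1 - math.exp(-nu * (M + T)))
            - nu * math.exp(-nu * M) * discounted_up_time(T, nu, lam))


def optimal_interval(nu, lam, M, iters=200):
    """Root of g by bisection; g is positive at 0 and eventually negative."""
    lo, hi = 0.0, 1.0
    while bracket(hi, nu, lam, M) > 0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bracket(mid, nu, lam, M) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For this lifetime distribution g is strictly decreasing, so the root found by bisection is the unique stationary point.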

We could repeat the whole analysis for the continuous time
analogue of the model where we ignore successfully-dealt-with
initiating events, or at least do not consider them inspections.
Using the notation of Model 1, the optimal values $V_i$
and $V_r$ satisfy

$$V_x = \sup_{T_x \ge 0}\Big\{ \int_0^{T_x} t \int_0^t f_x(u)\,\nu e^{-\nu(t-u)}\,du\,dt
 + \bar F_x(T_x)\Big[T_x + \int_0^M t\nu e^{-\nu t}dt + Me^{-\nu M} + e^{-\nu M}V_i\Big]$$
$$\qquad\qquad + \Big(\int_0^{T_x} f_x(t)\,e^{-\nu(T_x-t)}\,dt\Big)\Big[T_x + \int_0^R t\nu e^{-\nu t}dt + Re^{-\nu R} + e^{-\nu R}V_r\Big]\Big\}. \tag{4.12}$$


The same analysis that led to (4.7) can be applied to (4.12) to
find the optimal $T_i$ and $T_r$. There is a difference in the
special case when M = R, $F_i(\cdot) = F_r(\cdot) = F(\cdot)$, $T_i = T_r = T$
and $\tilde V_i = \tilde V_r = \tilde V$ where $\tilde V = V - 1/\nu$. Then

$$\tilde V(T) = D(T)/K(T) \tag{4.13}$$

where

$$D(T) = \int_0^T \bar F(u)\,du \tag{4.14}$$

$$K(T) = 1 - e^{-\nu(M+T)}\Big(1 + \nu\int_0^T \bar F(u)e^{\nu u}\,du\Big) \tag{4.15}$$

Lemma 4.2.

In this special case of Model 2, a sufficient condition for
S*, the optimal inspection interval, to be finite is that

$$r(\infty) > \frac{\nu}{1 + \nu\mu e^{-\nu M}} \tag{4.16}$$

where $r(\infty) = \lim_{s\to\infty} f(s)/\bar F(s)$ and $\mu = \int_0^\infty t f(t)\,dt < \infty$ is the
expected lifetime.

Proof.

At a local maximum or minimum of $\tilde V(T)$, $\tilde V'(T) = h(T)/K(T)^2 = 0$
where

$$h(T) = \bar F(T)\Big(1 - e^{-\nu(M+T)} - e^{-\nu(M+T)}\,\nu\int_0^T \bar F(u)e^{\nu u}\,du\Big)
 - \Big(\int_0^T \bar F(u)\,du\Big)\nu e^{-\nu(M+T)}\Big(\int_0^T f(u)e^{\nu u}\,du + F(0)\Big). \tag{4.17}$$

Since $K(T)^2 > 0$, the condition $\tilde V'(T) = 0$ reduces to h(T) = 0.
Notice that $h(0) = \bar F(0)[1-e^{-\nu M}] > 0$ but $h(\infty) = 0$. Thus to
ensure the maximum is not at $T = \infty$ we must show h'(T) is positive
as T tends to infinity. Differentiating h with respect
to T, it follows that as T tends to infinity

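$$\frac{h'(T)}{\bar F(T)} \to -r(\infty)\big(1 + \mu\nu e^{-\nu M}\big) + \nu^2\mu e^{-\nu M}(\nu b - 1) + \frac{\nu^2\mu e^{-\nu M}}{a} \tag{4.18}$$

where

$$r(\infty) = \lim_{T\to\infty} f(T)/\bar F(T) = \lim_{T\to\infty} r(T), \qquad a = \lim_{T\to\infty} e^{\nu T}\bar F(T),$$

and

$$b = \lim_{T\to\infty} \Big(\int_0^T \bar F(y)e^{\nu y}\,dy\Big)\Big/\bar F(T)e^{\nu T}. \tag{4.19}$$

If $\bar F(T)e^{\nu T} = \exp\big(-\int_0^T (r(t)-\nu)\,dt\big) \to c < \infty$ as $T \to \infty$, then $b \to \infty$
and $h'(\infty)$ is positive; this certainly occurs if $r(\infty) > \nu$. If
$\bar F(T)e^{\nu T} \to \infty$ as $T \to \infty$, then L'Hopital's Rule says

$$b = \lim_{T\to\infty} \frac{\bar F(T)e^{\nu T}}{\nu\bar F(T)e^{\nu T} - f(T)e^{\nu T}} = \frac{1}{\nu - r(\infty)}. \tag{4.20}$$

Thus

$$\frac{h'(T)}{\bar F(T)} \to -r(\infty)\big(1 + e^{-\nu M}\nu\mu\big) + \nu^2\mu e^{-\nu M}\,\frac{r(\infty)}{\nu - r(\infty)}. \tag{4.21}$$

Since we are assuming $\bar F(T)e^{\nu T} \to \infty$ we have $r(\infty) \le \nu$. If
$r(\infty) < \nu$ then on checking when (4.21) is positive we get (4.16).
Finally if $r(\infty) = \nu$, then $b = \infty$ and h'(T) is still positive at $T = \infty$.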

As an example suppose $\bar F(t) = we^{-\lambda t}$, t > 0, so the unit has
exponential lifetime with a probability 1-w of instantaneous
failure; then the optimal inspection time T satisfies

e-(v+A)T[A-vw)-(2v+A)e-vMe-AT  +  v(w+l)e-(v+A)T  +  ve-vVT]  =  0 

(4.22) 
and the condition (4.16) that guarantees a finite solution to
this equation is $\lambda > \nu(1 - we^{-\nu M})$.
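
As a short worked check of how (4.16) specialises to this example: for $\bar F(t) = we^{-\lambda t}$ the hazard rate is constant, so $r(\infty) = \lambda$, and the expected lifetime is $\mu = w/\lambda$. Substituting these two values (both follow from the assumed form of $\bar F$) gives

```latex
\lambda > \frac{\nu}{1 + \nu\,(w/\lambda)\,e^{-\nu M}}
\iff \lambda + \nu w e^{-\nu M} > \nu
\iff \lambda > \nu\,\bigl(1 - w e^{-\nu M}\bigr).
```

So a finite inspection interval is guaranteed whenever the failure rate exceeds the initiating-event rate reduced by the factor $1 - we^{-\nu M}$.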
5. TWO-UPSTATE MODEL
Model 7

We extend Model 2 of Section 3 to allow the unit to be in
either one of two different up states: 1-up and 2-up, which
have different failure rates. Let $s_j$, j = 1,2, be the probability
of remaining in state j next period given that it is in
state j this period, and $1-s_j$ the probability it will fail
in the next period. This model is intended to describe the
situation in which a repair might only correct minor faults
that caused the failure and not the underlying problem, which
caused and will continue to cause these faults. We take as our
state space $S = \{(p,g) \mid 0 \le p \le 1,\ 0 \le g \le \infty\}$, where p is the
belief that the unit is up, and g is the ratio of the probability
the unit is in the 1-up state to the probability it is in the
2-up state. Thus in the state (p,g) the beliefs that the unit is
down, in the 1-up state and in the 2-up state are respectively

$1-p$, $gp/(g+1)$, $p/(g+1)$.

We assume that after a repair the unit is in state (r,w) and
define $a = s_1/s_2$, where without loss of generality we assume
$s_1 \ge s_2$. The occurrence of a successfully-dealt-with initiating
event is treated as an inspection which takes no time. Let
V(p,g) be the maximum extra number of periods under the best
inspection policy until a catastrophic event, compared with having
no standby unit (i.e., same definition as in Section 2).

Again, Denardo's results [4] guarantee the optimal policy to
be a deterministic one, and V satisfies the optimality equation

$$V(p,g) = \max\{W_1(p,g),\ W_2(p,g),\ W_3(p,g)\} \tag{5.1}$$

$$W_1(p,g) = p + (1-\beta)\,V\big(s_2p(ag+1)/(g+1),\ ag\big) + \beta p\,V(i,g)$$

$$W_2(p,g) = p(1-\beta)^M V(i,g) + (1-p)(1-\beta)^R V(r,w)$$

$$W_3(p,g) = (1-\beta)^R V(r,w).$$

The assumption is that an inspection affects the probability
the unit is up, but not the ratio between the two up states,
whereas a repair always returns the unit to the state (r,w).
$(s_2p(ag+1)/(g+1),\ ag)$ is the Bayesian updated belief of the state
(p,g), using the fact that no initiating event occurred. The
optimal policy for this model is given as follows.
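
The belief update can be checked against a direct calculation on the three underlying probabilities (down, 1-up, 2-up). The Python sketch below is illustrative; the function names and numerical values are hypothetical.

```python
def update_belief(p, g, s1, s2):
    """One period with no initiating event: (p, g) -> (s2*p*(a*g+1)/(g+1), a*g)."""
    a = s1 / s2
    return s2 * p * (a * g + 1) / (g + 1), a * g


def update_direct(p, g, s1, s2):
    """Same update computed from the separate 1-up and 2-up beliefs."""
    up1, up2 = p * g / (g + 1), p / (g + 1)   # beliefs in 1-up and 2-up
    up1, up2 = s1 * up1, s2 * up2             # each state survives the period
    return up1 + up2, up1 / up2
```

Applying `update_belief` m times from (x, g0) gives the belief s2^m x (a^m g0 + 1)/(g0 + 1), because the ratio factors telescope; this is the form that appears when the state is re-expressed in terms of elapsed periods.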

Theorem 5.1.

The optimal policy is given by a function p*(g) and a
number g*, so that in state (p,g) it does nothing if p > p*(g),
inspects if $p \le p^*(g)$, g > g*, and repairs if $p \le p^*(g)$, $g \le g^*$.

Proof.

As in Theorem 3.1, an inductive proof on the iterates of
value iteration proves that V(p,g) is convex and non-decreasing
in p and non-decreasing in g. Now define
$W_g = \{p \mid V(p,g) = \max[W_2(p,g), W_3(p,g)]\}$; then the linearity of $W_2$ and $W_3$ and
the convexity of V in p guarantee $W_g$ is convex, just as in
Theorem 3.1. V(0,g) = V(0,g') since if p = 0 there is only
one state. From (5.1) it follows that

$$V(0,g) = \max\{(1-\beta)V(0,ag),\ (1-\beta)^R V(r,w)\}. \tag{5.2}$$

By definition $V(r,w) \ge 0$, and if $V(0,g) = W_1(0,g) = (1-\beta)V(0,ag) =
(1-\beta)V(0,g)$ then V(0,g) = 0, and hence $0 \in W_g$. Thus $W_g = [0,p^*(g)]$
and the result holds. g* satisfies $(1-\beta)^M V(i,g^*) = (1-\beta)^R V(r,w)$;
and since V(i,g) is non-decreasing in g this gives the
division between inspection and repair.

Again we can rewrite the state space in terms of the number
of periods since the last inspection and the last repair. Let
$S = \{(m,n) \mid 0 \le m \le n \le \infty\}$ where (m,n) is the state which is m
periods since the end of the last inspection (or the end of the
repair if it followed from the last inspection) and n non-inspection
periods since the last repair. The state (m,n) is
equivalent to $g = a^n w$ and $p = p(m,n)$, where

$$p(m,n) = \begin{cases}
s_2^m\, i\,(a^n w+1)/(a^{n-m}w+1), & n > m, \\
s_2^m\, r\,(a^n w+1)/(a^{n-m}w+1), & n = m.
\end{cases} \tag{5.3}$$

If we define p(m,n) according to (5.3), the optimality equation
for this state space is

$$V(m,n) = \max \begin{cases}
p(m,n) + (1-\beta)V(m+1,n+1) + \beta p(m,n)V(0,n) \\
p(m,n)(1-\beta)^M V(0,n) + (1-p(m,n))(1-\beta)^R V(0,0) \\
(1-\beta)^R V(0,0)
\end{cases} \tag{5.4}$$

Theorem 5.1 can be reinterpreted for this state space.

Corollary 5.1.

The optimal policy is given by a function m*(n) and a
number n* so that at (m,n), do nothing if m < m*(n); inspect
if $m \ge m^*(n)$, n > n*; repair if $m \ge m^*(n)$, $n \le n^*$. Notice if
$i \ge r$, n* = 0 and we always inspect.

Again we can use value iterations on a finite state approximation
of the Markov decision model given by (5.4) (see White
[22] for the bounds). This gives us the results found in Table 5,
namely the optimal periods for inspections, counting from the
last repair.
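
A value-iteration scheme of this kind might look like the Python sketch below. It is a minimal illustration, not the computation behind Table 5: the truncation at `n_max`, the way the boundary states are capped, the stopping rule, the function name, and the parameter values are all assumptions made here, and no error bounds of the kind cited above are computed.

```python
def solve_two_up_model(beta, i, r, s1, s2, w, M, R, n_max=30, tol=1e-8, max_iter=20000):
    """Value iteration for (5.4) on the truncated states 0 <= m <= n <= n_max."""
    a = s1 / s2
    dM, dR = (1 - beta) ** M, (1 - beta) ** R
    states = [(m, n) for n in range(n_max + 1) for m in range(n + 1)]

    def p_up(m, n):
        # Belief (5.3): r if no inspection since the repair (m == n), else i.
        x = r if m == n else i
        return s2 ** m * x * (a ** n * w + 1) / (a ** (n - m) * w + 1)

    def after_wait(m, n):
        # Successor of "do nothing", capped at the truncation boundary so
        # that states with an inspection since the repair keep m < n.
        n1 = min(n + 1, n_max)
        return (n1, n1) if m == n else (min(m + 1, n1 - 1), n1)

    p = {st: p_up(*st) for st in states}
    V = {st: 0.0 for st in states}
    policy = {}
    for _ in range(max_iter):
        new = {}
        for (m, n) in states:
            q = p[(m, n)]
            repair = dR * V[(0, 0)]
            options = {
                "wait": q + (1 - beta) * V[after_wait(m, n)] + beta * q * V[(0, n)],
                "inspect": q * dM * V[(0, n)] + (1 - q) * repair,
                "repair": repair,
            }
            policy[(m, n)] = max(options, key=options.get)
            new[(m, n)] = options[policy[(m, n)]]
        gap = max(abs(new[st] - V[st]) for st in states)
        V = new
        if gap < tol:
            break
    return V, policy
```

Reading `policy[(m, n)]` along the path of repeated "wait" actions from (0, 0) shows after how many periods the sketch first recommends an inspection or a repair.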

Note that the optimal inspection pattern appears to have
short inter-inspection times just after a repair, which gradually
increase to long inspection times, provided the system continues
to be found up upon inspection. Hazardous inspection (i small)
has a more drastic effect on the expected time to a catastrophic
failure than similar changes in r, or $s_1$ and $s_2$.

Model 8

As in Section 3, we could also model the situation in which
the information acquired from successfully-dealt-with initiating
events is ignored. Then $\beta$, i, $s_1$, $s_2$, M, R are still defined
as in Model 7, but immediately after an inspection or repair
the time to the next inspection or repair is determined, and
which kind it will be. Immediately after a repair suppose the
unit has probability $r_1$, $r_2$ respectively of being in the 1-up
or 2-up state. The decision points are immediately after a
repair, and immediately after an inspection, where it is important
to know the number of operating periods n since the last repair.
We denote the maximum expected times until a catastrophic event
at these decision points as $V_r$, $V_{i,n}$ respectively. As in Model 4
we can write down the optimality equation connecting these values:


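$$V_r = \max\Big\{ \max_{W_r}\Big[ L(r,W_r) + \big(1-f(r,W_r)\big)\Big(\frac{1-(1-\beta)^R}{\beta} + (1-\beta)^R V_r\Big)\Big];$$
$$\qquad \max_{T_r}\Big[ L(r,T_r) + \big(r_1s_1^{T_r} + r_2s_2^{T_r}\big)\Big(\frac{1-(1-\beta)^M}{\beta} + (1-\beta)^M V_{i,T_r}\Big)$$
$$\qquad\qquad + \big(1 - r_1s_1^{T_r} - r_2s_2^{T_r} - f(r,T_r)\big)\Big(\frac{1-(1-\beta)^R}{\beta} + (1-\beta)^R V_r\Big)\Big]\Big\}$$

$$V_{i,n} = \max_{T_{i,n}}\Big\{ L(i(n),T_{i,n}) + \big(i(n)_1s_1^{T_{i,n}} + i(n)_2s_2^{T_{i,n}}\big)\Big(\frac{1-(1-\beta)^M}{\beta} + (1-\beta)^M V_{i,n+T_{i,n}}\Big)$$
$$\qquad\qquad + \big(1 - i(n)_1s_1^{T_{i,n}} - i(n)_2s_2^{T_{i,n}} - f(i(n),T_{i,n})\big)\Big(\frac{1-(1-\beta)^R}{\beta} + (1-\beta)^R V_r\Big)\Big\} \tag{5.5}$$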


where $r = (r_1, r_2, 1-r_1-r_2)$. If $p = (p_1,p_2,p_3)$, where $p_1$ is the
probability of being in the 1-up state, $p_2$ is the probability of
being in the 2-up state, and $p_3$ is the probability of being
down, then

$$L(p,T) = T - p_3\sum_{k=1}^{T-1}\big(1-(1-\beta)^k\big)
 - p_1\sum_{k=1}^{T-2}\big(1-(1-\beta)^{T-k-1}\big)\big(1-s_1^k\big)
 - p_2\sum_{k=1}^{T-2}\big(1-(1-\beta)^{T-k-1}\big)\big(1-s_2^k\big)$$

is the expected time until a catastrophic failure in the first T
periods starting in state p.

$$i(n) = \Big(\frac{i\,r_1s_1^n}{r_1s_1^n + r_2s_2^n},\ \frac{i\,r_2s_2^n}{r_1s_1^n + r_2s_2^n},\ 1-i\Big)$$

is the state of the system
after inspection n operating periods after last repair, while

$$f(p,T) = (1-\beta)f(p,T-1) + p_1\beta\big(1-s_1^{T-1}\big) + p_2\beta\big(1-s_2^{T-1}\big) + \beta p_3 \tag{5.7}$$

is the probability there has been a catastrophic failure within
T periods, starting in state p. Again, the general results of
Markov renewal programming [7] show that the only possible
optimal policies are $\pi_r(W)$, i.e., repair every W, or $\pi_i(T_0,T_1,T_2,\ldots)$,
which is inspect $T_0$ periods after a repair, and $T_k$ periods
after the kth inspection after a repair. In order to find the
optimal policy it is easier to work with $\tilde V = V - 1/\beta$ again,
and using (5.5) we can show that under the policy $\pi_r(W)$, if
$r = (r_1, r_2, 1-r_1-r_2)$,

$$\tilde V_r = \Big(\frac{r_1(1-s_1^W)}{1-s_1} + \frac{r_2(1-s_2^W)}{1-s_2}\Big)\Big/\Big(1 - \big(1-f(r,W)\big)(1-\beta)^R\Big). \tag{5.8}$$
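
The quantities entering (5.8) are straightforward to compute. The Python sketch below is illustrative: the function names and values are hypothetical, and the starting value f(p,0) = 0 for the recursion (5.7) is an assumption (no catastrophe can occur in zero periods). A useful check, which follows from the definitions as read here, is that L(p,T) - f(p,T)/beta equals the expected up-time appearing in the numerator of (5.8).

```python
def catastrophe_prob(p, T, beta, s1, s2):
    """f(p, T) from the recursion (5.7), started at f(p, 0) = 0 (assumed)."""
    p1, p2, p3 = p
    f = 0.0
    for t in range(1, T + 1):
        down_before_t = p1 * (1 - s1 ** (t - 1)) + p2 * (1 - s2 ** (t - 1)) + p3
        f = (1 - beta) * f + beta * down_before_t
    return f


def expected_periods(p, T, beta, s1, s2):
    """L(p, T): expected time, capped at T, until a catastrophic failure."""
    p1, p2, p3 = p
    total = T - p3 * sum(1 - (1 - beta) ** k for k in range(1, T))
    for pj, sj in ((p1, s1), (p2, s2)):
        total -= pj * sum((1 - (1 - beta) ** (T - k - 1)) * (1 - sj ** k)
                          for k in range(1, T - 1))
    return total


def periodic_repair_value(W, r1, r2, beta, s1, s2, R):
    """Improvement under 'repair every W periods', as in (5.8)."""
    r = (r1, r2, 1 - r1 - r2)
    up_time = r1 * (1 - s1 ** W) / (1 - s1) + r2 * (1 - s2 ** W) / (1 - s2)
    return up_time / (1 - (1 - catastrophe_prob(r, W, beta, s1, s2)) * (1 - beta) ** R)
```

Evaluating `periodic_repair_value` over a range of W and taking the maximum gives the best repair-only policy for a given parameter set.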


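Under the policy $\pi_i(T_0,T_1,\ldots)$ we get the following equations:

$$\tilde V_r = \frac{r_1(1-s_1^{T_0})}{1-s_1} + \frac{r_2(1-s_2^{T_0})}{1-s_2}
 + \big(r_1s_1^{T_0} + r_2s_2^{T_0}\big)(1-\beta)^M \tilde V_{i,T_0}
 + \big(1 - r_1s_1^{T_0} - r_2s_2^{T_0} - f(r,T_0)\big)(1-\beta)^R \tilde V_r. \tag{5.9}$$

If $\tau_{k-1} = T_0 + T_1 + \ldots + T_{k-1}$,

$$\tilde V_{i,\tau_{k-1}} = \frac{i(\tau_{k-1})_1\big(1-s_1^{T_k}\big)}{1-s_1} + \frac{i(\tau_{k-1})_2\big(1-s_2^{T_k}\big)}{1-s_2}
 + \big(i(\tau_{k-1})_1s_1^{T_k} + i(\tau_{k-1})_2s_2^{T_k}\big)(1-\beta)^M \tilde V_{i,\tau_k}$$
$$\qquad\qquad + \big(1 - i(\tau_{k-1})_1s_1^{T_k} - i(\tau_{k-1})_2s_2^{T_k} - f(i(\tau_{k-1}),T_k)\big)(1-\beta)^R \tilde V_r. \tag{5.10}$$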


It appears somewhat difficult to solve (5.9) and (5.10) as we
have an infinite set of equations. However, we can assume for
all $\tau_k \ge N$, for some N, $\tilde V_{i,\tau_k}$ is approximately constant, since
if a large number of periods have passed since the last repair,
with no intervening failure, it is a good approximation to
assume the unit is in the better of the two up states. This
enables us to solve these equations using the bisection method

reviewed  in  Thomas  [1S|  .  The  method  depends  on  the  fact  that  if  we 

substitute $\tilde V_r = c$ in the R.H.S. of (5.9) and (5.10) we can work
back and solve for $\tilde V_r$ on the L.H.S. of (5.9). If c is the
correct value of $\tilde V_r$, the L.H.S. of (5.9) is c, but if $c > \tilde V_r$,
it follows easily that the L.H.S. of (5.9) will be smaller than
c, while if $c < \tilde V_r$ it will be greater than c. Using this as
the basis of the bisection method and taking all inspections
more than 50 periods after a repair as the same, we get the
forms of the approximately optimal policies found in Table 6
(the units of time are weeks).
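
The bisection idea can be sketched generically in Python. Here `rhs` stands for the map that takes a trial value c, substitutes it on the right-hand sides, and returns the resulting left-hand side; the function name, the search bounds, and the affine example are hypothetical. The sketch assumes the map is increasing with slope below one, which holds for equations of this discounted form, so a trial value below the solution is mapped above itself and a trial value above the solution is mapped below itself.

```python
def solve_by_bisection(rhs, lo=0.0, hi=1e6, iters=200):
    """Find c with rhs(c) == c for an increasing map with slope below one."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rhs(mid) > mid:
            lo = mid    # trial value too small: the solution lies above mid
        else:
            hi = mid    # trial value too large (or exact): solution is at or below mid
    return 0.5 * (lo + hi)
```

For instance, a repair-only policy leads to an affine map c -> u + d*c with 0 < d < 1, whose fixed point u/(1-d) the routine recovers; `solve_by_bisection(lambda c: 5.0 + 0.9 * c)` converges to 50.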
The parameters in the comparable continuous time model of

2 

Section  2  are  (in  units  of  weeks):  v  =  01,  =  -j  =  1  - 

6^  =  .04,  <$2  =  0.5,  M  =  0.035  and  R  =  0.07.  The  corresponding 
best policies under the "short-long" inspection rule of Section
2, with inter-inspection times restricted to being multiples
of a week, are as follows:

Table 7

Case   OKI   OKR   Best Policy        Best expected time to
                                      a catastrophic event

I      0.9   0.9   1 (4 times), 2     61.09
II     0.9   0.5   1 (2 times), 3     42.19
III    0.5   0.9   3 (1 time), 1      29.37
IV     0.5   0.5   3 (1 time), ∞      19.98

The difference in policies for Case III results from the
fact that the discrete time model allows a decision of repair
without inspection. The differences in the policies for Cases
I, II, and IV come about because the continuous time model only
allows inspection periods of two different lengths, whereas
the optimal policy in the discrete time model goes gradually
from the length of the inspection period just after a repair
to an asymptotic inspection period if the inspections are
successful. However, subject to its restrictions, the policy
of the continuous time model is comparable to that of the discrete
time model.

The differences between the best expected times to a
catastrophic event in the two models result from the discretization
of time in Model 8. If the time interval in the discrete
time model of Case I is taken to be 1/10 week instead of 1 week,
with the resulting change of parameters $\beta$ = .01, i = 0.9,
$r_1$ = 0.6, $r_2$ = 0.3, $s_1$ = .996, $s_2$ = .95, M = 0.35, R = 0.7,
then the optimal policy is to inspect 7 periods after a repair,
and if up, then 8 periods later, then 9, 11, 13, 16, 18 and
20 periods, and the expected time until catastrophic failure is
626.0 periods. In the original time scale this is a time of
62.6 weeks. Note that the difference between the expected times
to a catastrophic event is now small for the two models. This
suggests that the policy that was proposed in Section 2, while
not optimal, is a good one.
6. CONCLUSIONS


The following conclusions can be drawn about the form of
the optimal policy, by studying the models in this paper.

1) If the failure rate of the system increases with age,
then the inspection intervals should decrease, and do. Numerical
examples based on Model 1 have borne this out. The model
calculations suggest optimal intervals based on the underlying
parameters.

2) If there is only one state the unit can be in when it
is 'up', and the probability of being up, i, is the same after
each inspection and the probability of being up after a repair
is also a constant r, then the optimal policy is to have one
'short' inspection interval after a repair, and a 'longer'
inspection interval always thereafter (i > r), or else to repair
at fixed intervals with no inspection (r considerably larger
than i). The 'longer' inspection interval must always be at least as
long as the 'short' initial inspection interval.

3) The results of 1) and 2) hold whether or not successfully-dealt-with
initiating events are considered as a type of inspection.
However, there are considerable differences in the actual
inspection periods for these two cases.

4) In order for the optimal inspection problem to require
several 'short' inspection intervals followed by longer ones
it is necessary to assume the unit can be in more than one 'up'
state with different failure rates. In this case there is
not an abrupt jump from 'short' inspection intervals to 'long',
but a gradual increase in the inspection interval. However,
there is a suggestion that a policy comparable to the optimal
one, in which there is a sharp jump between short inspections
and long ones, will give an expected time to a catastrophic
event that is close to that achieved by the optimal policy.
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Australian  National  University 
Canberra  A.C.T.  2606 
AUSTRALIA 

Mr.  DeSavage 

Naval  Surface  Weapons  Center 
Silver  Springs,  MD  20910 

Professor  C.  Derman 

Dept,  of  Civil  Eng.  &  Mech.  Engineering 
Columbia  University 
New  York,  NY  10027 

Dr.  Guy  Fayol le 
I . N . R. I . A. 

Dorn  de  Voluceau-Rocquencourt 
78150  Le  Chesnay  Cedex 
FRANCE 
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Dr.  M.  J.  Fischer 

Defense  Communications  Agency 

1860  Wiehle  Avenue 

Reston,  VA  22070 

1 

Professor  George  S.  Fishman 

Cur.  in  OR  &  Systems  Analysis 

University  of  North  Carolina 

Chapel  Hill,  NC  20742 
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Dr.  R.  Gnanadesikan 

Bell  Telephone  Lab 

Murray  Hill,  NJ  07733 
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Professor  Bernard  Harris 

Department  of  Statisr.es 

University  of  Wisconsin 

610  Walnut  Street 

Madison,  WI  53706 
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Dr.  Gerhard  Heiche 

Naval  Air  Systems  Command  (NAIR  03) 
Jefferson  Plaza,  No.  1 

Arlington,  VA  20360 

1  'i 
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Professor  L.  H.  Herbach 

Department  of  Mathematics 

Polytechnic  Institute  of  N.Y. 

Brooklyn,  NY  11201 

1  j 

. 

Professor  w.  M.  Hinich 

University  of  Texas 

Austin,  TX  78712 
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i  ' 

P.  Heidelberger 

IBM  Research  Laboratory 

Yorktown  Heights 

New  York,  NY  10598 
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W.  D.  Hibler,  III 

Geophysical  Fluid  Dynamics 

Princeton  University 

Princeton,  NJ  08540 
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Professor  D.  L.  Iglehart 

Department  of  operations  Research 

Stanford  University 

Stanford,  CA  94350 
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Dr.  D.  Vere  Jones  1 

Department  of  Mathematics 

Victoria  University  of  Wellington 

P.  O.  Box  196 

Wellington 

NEW  Z ELAND 

Professor  J.  B.  Kadane  1 

Department  of  Statistics 
Carnegie -Me lion 
Pittsburgh,  PA  15212 

Professor  Guy  Latouche  i 

University  Libre  Bruxelles 
C.P.  212 

Blvd  De  Triomphe 
B-1050  Bruxelles 
BELGIUM 

Dr.  Richard  Lau  1 

Office  of  Naval  Research 

Branch  Office 

1030  East  Green  Street 

Pasadena,  CA  91101 

A.  J.  Laurance  1 

Dept,  of  Mathematics  Statistics 

University  of  Birmingham 

P.  O.  Box  363 

Birmingham  B15  2TT 

ENGLAND 

Dr. John Copas

Dept. of Mathematics Statistics

University  of  Birmingham 

P.  O.  Box  363 

Birmingham  B15  2TT 

ENGLAND 

Professor M. Leadbetter

Department  of  Statistics 
University  of  North  Carolina 
Chapel  Hill,  NC  27514 

Mr. Dan Leonard

Code  8105 

Naval  Ocean  Systems  Center 
San  Diego,  CA  92132 

M. Lepparanta

Winter Navigation Res. Bd.

Helsinki 

FINLAND 


J.  Lehoczky 

Department  of  Statistics 
Carnegie-Mellon  University 
Pittsburgh,  PA  15213 

Library 

Naval  Ocean  Systems  Center 
San  Diego,  CA  92132 

Library
Code  1424 

Naval  Postgraduate  School 
Monterey,  CA  93943 

Dr.  J.  Maar  (R51) 

National  Security  Agency 
Fort Meade, MD 20755

Bob  Marcello 

Canada  Marine  Engineering 

Calgary

CANADA 

Dr.  M.  McPhee 

Chair  of  Arctic  Marine  Science 
Oceanography  Department 
Naval  Postgraduate  School 
Monterey,  CA  93943 

Dr.  M.  Mazumdar 

Dept. of Industrial Engineering
University  of  Pittsburgh 
Oakland 

Pittsburgh,  PA  15235 

Professor  Rupert  G.  Miller,  Jr. 
Statistics  Department 
Sequoia  Hall 
Stanford  University 
Stanford,  CA  94305 

National  Science  Foundation 
Mathematical  Sciences  Section 
1800  G  Street,  NW 
Washington,  DC  20550 


Naval  Research  Laboratory 
Technical  Information  Section 
Washington,  DC  20375 

Professor  Gordon  Newell 
Dept. of Civil Engineering
University  of  California 
Berkeley,  CA  94720 

Dr.  David  Oakes 

TUC Centenary Inst. of Occ. Health
London School of Hygiene/Tropical Med.
Keppel St. (Gower St.)

London WC1E 7HT
ENGLAND 

Dr.  Alan  F.  Petty 
Code  7930 

Navy  Research  Laboratory 
Washington,  DC  20375 

E.  M.  Reimnitz 

Pacific-Arctic  Branch-Marine  Geology 
U.  S.  Geological  Survey 
345 Middlefield Rd., (MS99)

Menlo  Park,  CA  94025 

Prof.  M.  Rosenblatt 
Department  of  Mathematics 
University  of  California  -  San  Diego 
La  Jolla,  CA  92093 

Professor  I.  R.  Savage 
Department  of  Statistics 
Yale  University 
New  Haven,  CT  06520 

Professor  W.  R.  Schucany 
Department  of  Statistics 
Southern  Methodist  University 
Dallas,  TX  75222 

Professor  D.  C.  Siegmund 
Department  of  Statistics 
Sequoia  Hall 
Stanford  University 
Stanford,  CA  94305 

Professor H. Solomon
Department  of  Statistics 
Sequoia  Hall 
Stanford  University 
Stanford,  CA  94305 

Dr.  Ed  Wegman 

Statistics  &  Probability  Program 
Code  411 (SP) 

Office  of  Naval  Research 
Arlington, VA 22217

Dr.  Douglas  de  Priest 
Statistics  &  Probability  Program 
Code  411 (SP) 

Office  of  Naval  Research 
Arlington,  VA  22217 

Dr.  Marvin  Moss 

Statistics  &  Probability  Program 
Code 411 (SP)

Office  of  Naval  Research 
Arlington,  VA  22217 

Technical  Library 
Naval Ordnance Station
Indian  Head,  MD  20640 

Professor  J.  R.  Thompson 
Dept. of Mathematical Science
Rice University
Houston, TX 77001

Professor  J.  W.  Tukey 
Statistics  Department 
Princeton  University 
Princeton,  NJ  08540 

P.  Wadhams 

Scott  Polar  Research 
Cambridge  University 
Cambridge  CB2  1ER 
ENGLAND 

Daniel  H.  Wagner 
Station  Square  One 
Paoli,  PA  19301 

Dr.  W.  Weeks 
U. S. Army CRREL
72 Lyme Road
Hanover,  NH  03755 


P.  Welch 

IBM  Research  Laboratory 
Yorktown  Heights,  NY  10598 

Pat  Welsh 

Head,  Polar  Oceanography  Branch 
Code  332 

Naval Ocean Research & Dev. Activity
NSTL  Station 
Mississippi  39529 

Dr.  Roy  Welsch 
Sloan  School 

M.I.T.

Cambridge,  MA  02139 

Dr.  Morris  DeGroot 
Statistics  Department 
Carnegie-Mellon University
Pittsburgh,  PA  15235 

Professor  R.  Renard 
Head,  Meteorology  Department 
Naval  Postgraduate  School 
Monterey,  CA  93943 

Dr.  A.  Weinstein 

Commanding  Officer 

Naval  Environmental  Prediction 

Research  Facility 

Monterey,  CA  93943 

Paul  Lowe 

Naval  Environmental  Prediction 
Research  Facility 
Monterey,  CA  93943 

Wayne  Sweet 

Naval  Environmental  Prediction 
Research  Facility 
Monterey,  CA  93943 

Dr.  Colin  Mallows 

Bell  Telephone  Laboratories 

Murray Hill, NJ 07974

Dr.  U.  Preyibon 

Bell  Telephone  Laboratories 

Murray Hill, NJ 07974

Dr. Jon Kettenring

Bell  Telephone  Laboratories 

Murray  Hill,  NJ  07974 

Professor Grace Wahba
Department  of  Statistics 
University  of  Wisconsin 
1210  W.  Dayton  St. 

Madison,  WI  53706 